This package can be used to run the Adversarially Robust Kernel Smoothing (ARKS) algorithm on deep learning tasks. It accompanies the paper "Adversarially Robust Kernel Smoothing".
The implementation is in Python and PyTorch, with current support for the Fashion-MNIST, CIFAR-10, and CelebA datasets. Our codebase also includes implementations of baseline optimization methods, Empirical Risk Minimization (ERM) and Wasserstein Risk Minimization (WRM) [1], as well as adversarial attack methods, Projected Gradient Descent (PGD) [2] and the Fast Gradient Sign Method (FGSM) [3].
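For readers unfamiliar with the two attack methods, a minimal self-contained PyTorch sketch is given below. This is not the repository's implementation in `./src/main/attacks.py`; it uses a toy linear classifier on random inputs in place of the paper's networks. FGSM takes a single step of size `eps` in the sign of the loss gradient, and PGD iterates such steps, projecting back onto the L-infinity ball after each one:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Fast Gradient Sign Method [3]: one step of size eps
    in the sign of the classification-loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def pgd(model, x, y, eps, step, n_iter):
    """Projected Gradient Descent [2]: iterated signed-gradient steps,
    each followed by projection onto the L-inf ball of radius eps."""
    x_adv = x.clone().detach()
    for _ in range(n_iter):
        x_adv = fgsm(model, x_adv, y, step)
        # Project the accumulated perturbation back into the eps-ball.
        x_adv = x + (x_adv - x).clamp(-eps, eps)
    return x_adv.detach()

# Toy demo: a linear classifier on random flattened 28x28 "images".
torch.manual_seed(0)
model = torch.nn.Linear(784, 10)
x = torch.rand(8, 784)
y = torch.randint(0, 10, (8,))
x_pgd = pgd(model, x, y, eps=0.1, step=0.02, n_iter=10)
print((x_pgd - x).abs().max().item())  # stays within the 0.1 ball
```

The projection step is what distinguishes PGD from repeatedly applying FGSM: however many iterations run, the perturbation never exceeds `eps` in any pixel.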
To install the package for development purposes, run the following steps:

- Create a new environment and install Python 3.6, for example using conda:

  $ conda create --name arks python=3.6

- Install the package in editable mode along with the dependencies outlined in `setup.py`:

  $ pip install -e .
The Fashion-MNIST and CIFAR-10 datasets are automatically downloaded to the `./data` folder by the `torchvision.datasets` library when running our code.
The CelebA image dataset must be manually configured using the following steps:

- Download the dataset (`archive.zip`) into `./data/celeba` from https://www.kaggle.com/jessicali9530/celeba-dataset
- From the root of this project, run the following commands:

  $ cd data/celeba
  $ unzip archive.zip
  $ jupyter nbconvert --to notebook --inplace --execute celeba_binary_indices.ipynb
Key files:

- `./src/arguments.py` contains descriptions of the arguments for training and evaluating models with ARKS and the other algorithms.
- `./src/main/methods.py` contains the implementation of ARKS and the other algorithms.
- `./src/main/attacks.py` contains the implementation of the adversarial attack algorithms.
- `./src/utils/model.py` contains the definitions of the model architectures.
- `./src/utils/data.py` contains scripts for loading and preparing data for training and testing.
- `./src/main/train.py` contains a script for training and testing models.
- `./src/main/evaluate.py` contains a script for evaluating a model trained with a specific algorithm (e.g. `--alg-name arks`) against adversarial attacks. The attacks can be generated by attacking a model trained with a different algorithm than the evaluation algorithm (we use `--alg-attack erm` in our experiments). All models must first be saved when running `./src/main/train.py` by enabling the `--save-model` flag. For demonstration purposes, we provide example trained models in `./models`.
To evaluate a model trained with ARKS on adversarial perturbations of, for example, Fashion-MNIST images, run:
$ python src/main/evaluate.py --alg-name arks --alg-attack erm --seed 0 --data fashion_mnist --model-class cnn1 --sigma 0.5
Enable `--record-test-images` to view sample adversarial images and the model's predictions (blue corresponds to correct predictions and red to incorrect ones; the true label is indicated at the top of each image).
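A visualization of this kind could be reproduced along the following lines. This is a hypothetical sketch, not the repository's plotting code; the function name `plot_predictions` and the arrays `images`, `preds`, and `labels` are illustrative assumptions:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

def plot_predictions(images, preds, labels, path="predictions.png"):
    """Show each image with its true label on top; the label is drawn
    in blue when the prediction is correct and in red when it is wrong."""
    fig, axes = plt.subplots(1, len(images), figsize=(2 * len(images), 2))
    for ax, img, pred, true in zip(np.atleast_1d(axes), images, preds, labels):
        ax.imshow(img, cmap="gray")
        ax.set_title(str(true), color="blue" if pred == true else "red")
        ax.axis("off")
    fig.savefig(path)
    plt.close(fig)

# Demo with random 28x28 "images"; the middle prediction is wrong.
rng = np.random.default_rng(0)
plot_predictions(rng.random((3, 28, 28)), preds=[1, 2, 3], labels=[1, 0, 3])
```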
To evaluate a model trained with WRM on adversarial perturbations of, for example, Fashion-MNIST images, run:
$ python src/main/evaluate.py --alg-name wrm --alg-attack erm --seed 0 --data fashion_mnist --model-class cnn1 --gamma 1.0
To train and test ARKS on Fashion-MNIST, run:
$ python src/main/train.py --alg-name arks --data fashion_mnist --model-class cnn1 --lr 0.001 --lr-inner 0.01 --sigma 0.5 --evaluate
To train and test ARKS on CIFAR-10, run:
$ python src/main/train.py --alg-name arks --data cifar_10 --model-class resnet --lr 0.1 --lr-inner 0.001 --sigma 0.1 --opt-name sgd --decay-lr --activation relu --batch-size 128 --evaluate
To train and test ARKS on CelebA, run:
$ python src/main/train.py --alg-name arks --data celeba --model-class cnn2 --lr 0.001 --lr-inner 0.002 --sigma 0.2 --activation lrelu --batch-size 128 --evaluate
If you make use of this code in your work, please cite our paper:
@misc{zhu2021adversarially,
  title={Adversarially Robust Kernel Smoothing},
  author={Jia-Jie Zhu and Christina Kouridi and Yassine Nemmour and Bernhard Schölkopf},
  year={2021},
  eprint={2102.08474},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
[1] Aman Sinha et al. “Certifying Some Distributional Robustness with Principled Adversarial Training”. In: arXiv:1710.10571 (2017)
[2] Aleksander Madry et al. “Towards Deep Learning Models Resistant to Adversarial Attacks”. In: arXiv:1706.06083 (2019)
[3] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. “Explaining and Harnessing Adversarial Examples”. In: arXiv:1412.6572 (2015)